WIP: Fixed column name\order in iris dataset #408

Closed
wants to merge 1 commit into from

Conversation

OliaG
Contributor

@OliaG OliaG commented Jun 25, 2018

per issue #400
Fixed the iris dataset

@OliaG
Contributor Author

OliaG commented Jun 25, 2018

I have switched the order and the names of two columns. I'm not sure why that affected the accuracy metric. @Ivanidzo4ka, @codemzs, is that expected?

@codemzs
Member

codemzs commented Jun 25, 2018

Hi Olia,
public class IrisData
{
    [Column("0")]
    public float Label;

    [Column("1")]
    public float SepalLength;

    [Column("2")]
    public float SepalWidth;

    [Column("3")]
    public float PetalLength;

    [Column("4")]
    public float PetalWidth;
}

You will also need to update the order in which values are passed during prediction, so that the accuracy metrics stay the same. Do we really need to change the order in the dataset?
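To illustrate the point about ordering, here is a minimal, self-contained sketch of how an index-based column attribute ties each field to a position in the data file. It does not use ML.NET: the `ColumnAttribute` and `ColumnBinder` types below are hypothetical stand-ins written for this example, and the sample rows are made up. The idea it shows is that if two columns are swapped in the file, or the indices on the class change, the same field ends up carrying a different measurement, so any values passed by hand at prediction time have to follow the same arrangement used for training.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical stand-in for the [Column] attribute above; NOT the ML.NET type.
[AttributeUsage(AttributeTargets.Field)]
public sealed class ColumnAttribute : Attribute
{
    public int Index { get; }

    public ColumnAttribute(string index)
    {
        Index = int.Parse(index, CultureInfo.InvariantCulture);
    }
}

public class IrisData
{
    [Column("0")] public float Label;
    [Column("1")] public float SepalLength;
    [Column("2")] public float SepalWidth;
    [Column("3")] public float PetalLength;
    [Column("4")] public float PetalWidth;
}

public static class ColumnBinder
{
    // Hypothetical helper: fills every annotated float field from the
    // file column whose position matches the attribute's index.
    public static T Bind<T>(string line, char separator = ',') where T : class, new()
    {
        string[] cells = line.Split(separator);
        T row = new T();
        foreach (FieldInfo field in typeof(T).GetFields())
        {
            ColumnAttribute column = field.GetCustomAttribute<ColumnAttribute>();
            if (column == null)
            {
                continue; // field is not mapped to a file column
            }
            float value = float.Parse(cells[column.Index], CultureInfo.InvariantCulture);
            field.SetValue(row, value);
        }
        return row;
    }
}

public static class Program
{
    public static void Main()
    {
        // Made-up row: label, sepal length, sepal width, petal length, petal width.
        IrisData original = ColumnBinder.Bind<IrisData>("0,5.1,3.5,1.4,0.2");

        // Same measurements with columns 1 and 2 swapped in the file.
        IrisData swapped = ColumnBinder.Bind<IrisData>("0,3.5,5.1,1.4,0.2");

        // The field named SepalLength now holds what used to be the sepal width.
        Console.WriteLine(original.SepalLength == swapped.SepalWidth);
    }
}
```

Under these assumptions, a prediction input built with the old arrangement would feed the model values in positions it was not trained on, which is one plausible reason the metric moved.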



@TomFinley
Contributor

Hi, thanks @OliaG. Incidentally, could we perhaps make changes in forks of this repository, rather than in this repository itself?

@bojanmisic
Contributor

bojanmisic commented Jun 26, 2018

I wanted to jump in and tackle this one as my first bug fix, but @OliaG it's yours :).

I believe the tests (TrainAndPredictIrisModelTest and TrainAndPredictIrisModelWithStringLabelTest) should also be modified to use more realistic values when testing prediction, since the values are shifted there as well. This is where I encountered some strange behavior: for (SepalLength, SepalWidth, PetalLength, PetalWidth) = (4.4, 3.1, 2.5, 1.2), the predictions differed dramatically between the string-label test (TrainAndPredictIrisModelWithStringLabelTest) and the numeric-label test (TrainAndPredictIrisModelTest). I still haven't figured out whether I was causing it somehow (I made the dataset modifications the same way you did).

Here is my repo's commit.

@shauheen
Contributor

@OliaG, as Tom mentioned, it would be great if you would do the PR against a fork. Can we perhaps close this PR (remove the branch) and take a PR against @bojanmisic's fork?

@OliaG OliaG changed the title Fixed column name\order in iris dataset WIP: Fixed column name\order in iris dataset Jun 26, 2018
@OliaG
Contributor Author

OliaG commented Jun 26, 2018

@shauheen sure, let's do it. I'll close this PR and @bojanmisic can create his. We will also need to update our samples repo: the classification and clustering examples use the same dataset. @bojanmisic, you're welcome to do that too, or I can do it.

@OliaG OliaG closed this Jun 26, 2018
@OliaG OliaG deleted the irisdata branch June 26, 2018 17:23
@bojanmisic
Contributor

@OliaG, PR created: #428. The fix for the samples repo will follow. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Mar 30, 2022
5 participants